The \(t\) distribution

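Used to estimate a population mean from a small sample drawn from a nearly normal population, when the population standard deviation is unknown and is estimated by the sample standard deviation \(s\).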

Conditions

  • Independent observations (\(n < .1N\))
  • Nearly normal population distribution (check distribution of sample)

\(t\) versus normal

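The \(t\) distribution is bell-shaped and centered at zero like the standard normal, but it has heavier tails. As the degrees of freedom increase, it approaches the normal distribution.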
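
A quick numerical check of the heavier tails, as a sketch. The cutoff of -2 and the two df values are arbitrary illustrative choices, not values from these slides.

```r
# Probability of landing more than 2 units below the center
cutoff <- -2

normal_tail <- pnorm(cutoff)        # standard normal
t_tail_5    <- pt(cutoff, df = 5)   # t with 5 degrees of freedom
t_tail_30   <- pt(cutoff, df = 30)  # t with 30 degrees of freedom

# The t puts more probability in the tail than the normal,
# and the gap narrows as df grows
c(normal = normal_tail, t_df5 = t_tail_5, t_df30 = t_tail_30)
```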

Degrees of Freedom

The number of values that are free to vary without violating any constraint imposed on them.

Parameters

\(\mu\)


Since \(\bar{x} = \frac{1}{n}\sum_{i = 1}^n x_i\), one of our observations is constrained once \(\bar{x}\) is known, leaving \(n-1\) that are free to vary.

\[ df = n - 1\]
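
The constraint can be seen directly in a small sketch. The five data values below are made up for illustration: once the sample mean and any \(n - 1\) observations are known, the remaining one is determined.

```r
# Hypothetical sample of n = 5 observations (illustrative values only)
x <- c(3.1, 4.7, 5.2, 3.9, 4.6)
n <- length(x)
x_bar <- mean(x)

# Pretend the last observation is unknown: the mean and the
# first n - 1 values pin it down exactly
last_value <- n * x_bar - sum(x[1:(n - 1)])

# Recovered value matches the actual last observation
isTRUE(all.equal(last_value, x[n]))

df <- n - 1  # only n - 1 observations were free to vary
```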

Hypothesis testing

  1. State hypotheses, e.g. \[ H_0: \mu = 4; \quad H_A: \mu \ne 4\]
  2. Check conditions
    • Independent observations
    • Nearly normal population
  3. Compute the observed \(t\)-statistic, with \(\mu\) set to its value under \(H_0\) \[ t_{obs} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
  4. Draw picture to assess where \(t_{obs}\) falls in \(t_{df = n - 1}\)
  5. Compute a two-tailed p-value
  6. State conclusion
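
The computational steps above can be sketched in R. The data vector is hypothetical and the names `mu_0`, `t_obs`, and `p_value` are chosen here for illustration; the null value 4 comes from the example hypotheses. The built-in `t.test()` is used only as a cross-check.

```r
# Hypothetical sample (illustrative values only)
x <- c(4.4, 3.9, 5.1, 4.8, 4.2, 5.0, 4.6, 3.8, 4.9, 4.5)
mu_0 <- 4  # value of mu under H_0

n     <- length(x)
x_bar <- mean(x)
s     <- sd(x)
se    <- s / sqrt(n)  # standard error of x_bar

# Observed t-statistic and its degrees of freedom
t_obs <- (x_bar - mu_0) / se
df    <- n - 1

# Two-tailed p-value: area beyond |t_obs| in both tails
p_value <- 2 * pt(-abs(t_obs), df = df)

# Cross-check against R's one-sample t-test
check <- t.test(x, mu = mu_0)

c(t_obs = t_obs, df = df, p_value = p_value)
```

If `p_value` falls below the chosen significance level, the data provide evidence against \(H_0\); otherwise we fail to reject it.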

Confidence intervals

point estimate \(\pm\) margin of error

\[ \bar{x} \pm t^*_{df} \times SE \]

  • \(\bar{x}\): point estimate of \(\mu\).
  • \(t^*_{df}\): critical value that leaves a total of \(\alpha\) in the two tails (\(\alpha/2\) in each) of a \(t\) distribution with \(df = n - 1\).
  • \(SE\): standard error of \(\bar{x}\), \(s/\sqrt{n}\).
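
A minimal sketch of the interval, assuming a 95% confidence level and a made-up data vector. The names `conf_level`, `t_star`, and `ci` are illustrative; `t.test()` is used only to confirm the hand computation.

```r
# Hypothetical sample (illustrative values only)
x <- c(4.4, 3.9, 5.1, 4.8, 4.2, 5.0, 4.6, 3.8, 4.9, 4.5)

conf_level <- 0.95
alpha      <- 1 - conf_level

n  <- length(x)
se <- sd(x) / sqrt(n)  # standard error of x_bar

# Critical value: leaves alpha / 2 in each tail of t with n - 1 df
t_star <- qt(1 - alpha / 2, df = n - 1)

# point estimate +/- margin of error
ci <- mean(x) + c(-1, 1) * t_star * se
ci

# Same interval as reported by the built-in test
builtin_ci <- as.numeric(t.test(x, conf.level = conf_level)$conf.int)
```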

Finding p-values and \(t^*_{df}\)

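# Lower-tail probability P(T < -2.2) for a t with 18 df
pt(-2.2, df = 18)
## [1] 0.0206
# 2.5th percentile of the same t, i.e. -t* for a 95% interval
qt(.025, df = 18)
## [1] -2.1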
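
Since `pt()` returns a single lower-tail area, a two-tailed p-value needs doubling, and the positive critical value follows from symmetry. A short sketch reusing the same numbers (the variable names and the 0.05 level are illustrative assumptions):

```r
t_obs <- -2.2
df    <- 18
alpha <- 0.05

# Two-tailed p-value: double the area beyond |t_obs| in one tail
p_two_tailed <- 2 * pt(-abs(t_obs), df = df)

# Positive critical value for a 95% interval; by symmetry this
# equals -qt(alpha / 2, df)
t_star <- qt(1 - alpha / 2, df = df)

# Rejecting when |t_obs| > t_star agrees with p < alpha
reject_by_t <- abs(t_obs) > t_star
reject_by_p <- p_two_tailed < alpha
```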

Postscript

Meet William Sealy Gosset.


Problem: A batch of beer should have a fixed [chemical level related to barley] in order to be of good quality. Can you test a small number of barrels and infer if the entire batch is of good enough quality?